---
title: Databricks
sidebarTitle: Databricks
---

This is the implementation of the Databricks data handler for MindsDB.

[Databricks](https://databricks.com/) is a data lakehouse that unifies the best of data warehouses and data lakes in one simple platform to handle all your data, analytics, and AI use cases. It’s built on an open and reliable data foundation that efficiently handles all data types and applies one common security and governance approach across all of your data and cloud platforms.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB [locally via Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or use [MindsDB Cloud](https://cloud.mindsdb.com/).
2. To connect Databricks to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Databricks.

## Implementation

This handler is implemented using `databricks-sql-connector`, a Python library that allows you to use Python code to run SQL commands on Databricks clusters and Databricks SQL warehouses.

The required arguments to establish a connection are as follows:

* `server_hostname` is the server hostname for the cluster or SQL warehouse.
* `http_path` is the HTTP path of the cluster or SQL warehouse.
* `access_token` is a Databricks personal access token for the workspace.

There are several optional arguments as follows:

* `session_configuration` is a dictionary of Spark session configuration parameters.
* `http_headers` stores additional (key, value) pairs to set in HTTP headers on every RPC request the client makes.
* `catalog` is the catalog to use for the connection. Typically, defaults to `hive_metastore` if not provided.
* `schema` is the schema (database) to use for the connection. Defaults to `default` if not provided.

## Usage

In order to make use of this handler and connect to the Databricks database in MindsDB, the following syntax can be used:

```sql
CREATE DATABASE databricks_datasource
WITH
  engine = 'databricks',
  parameters = {
      "server_hostname": "adb-1234567890123456.7.azuredatabricks.net",
      "http_path": "sql/protocolv1/o/1234567890123456/1234-567890-test123",
      "access_token": "dapi1234567890ab1cde2f3ab456c7d89efa",
      "schema": "example_db"
  };
```

You can use this established connection to query your table as follows:

```sql
SELECT *
FROM databricks_datasource.example_table;
```
